DETAILED ACTION 



Specification 



1. The disclosure is objected to because of the following informalities: 

a) On page 10, line 4, "syllables In" should be --syllables. In--. A period is 
missing. 

b) On page 13, line 5, "characterize" should be --characterized--. 

c) On page 23, line 23, "June 6, 200" should be --June 6, 2000--. 

Appropriate correction is required. 

2. The listing of references in the specification is not a proper information disclosure 
statement. 37 CFR 1.98(b) requires a list of all patents, publications, or other 
information submitted for consideration by the Office, and MPEP § 609 A(1) states, "the 
list may not be incorporated into the specification but must be submitted in a separate 
paper." Therefore, unless the references have been cited by the examiner on form 
PTO-892, they have not been considered. 



Claim Rejections - 35 USC § 102 

3. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

4. Claims 1-3, 8, 11-12, and 15 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Nanjo et al. (U.S. Patent 5,778,361). 

5. In regard to claims 1 and 15, Nanjo et al. discloses a method and a program 
storage device (404) for performing the method steps that is used for managing 
(indexing) a textual database. 

6. The textual data (Fig. 3, document 321 and document 322) are disclosed as text 
files (*.txt) that inherently must have been transcribed into semantic units before 
indexing (column 9, lines 7-11). 

7. The textual data is stored in a textual database (computer based collection of 
documents) (column 3, lines 7-10). 

8. An index of the textual database is created based on semantic units. As broadly 
recited in the claim, the term "semantic units" has been interpreted as being any unit of 
semantics, i.e. a letter, phoneme, syllable, morpheme, word, or phrase. The textual 
data is indexed by an inverted list (302) that stores the semantic units (indexing terms) 
and references the documents containing each term (column 9, lines 11-15). 
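
For illustration only, the kind of inverted list discussed in paragraph 8 can be sketched as below. This is a minimal sketch and not the code of Nanjo et al.: the function name build_inverted_list, the whitespace splitting of terms, and the sample document contents are all assumptions made for the example. Only the document numbers 321 and 322 are borrowed from Fig. 3.

```python
from collections import defaultdict


def build_inverted_list(documents):
    """Map each indexing term to the sorted ids of the documents containing it.

    `documents` is a dict of {document id: text}. Terms are produced by a
    plain whitespace split, which is an assumption of this sketch only.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical contents for two documents.
documents = {321: "speech index search", 322: "index of syllables"}
inverted = build_inverted_list(documents)
```

Looking up inverted["index"] returns the ids of both documents, which is the sense in which each indexing term references the documents that contain it.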

9. In regard to claims 2 and 3, Nanjo et al. discloses that the semantic units used for 
indexing are both syllables and morphemes. In Fig. 7, a flowchart (700) is given that 
shows the steps for indexing input characters. The code in Fig. 7 breaks a string of 
Japanese kanji characters into substrings until the final semantic unit (index term) is the 
final kanji character of the preliminary index term (column 14, lines 26-41 and column 
15, lines 13-17). A single kanji character can be a morpheme, a single syllable, 
or both. 

10. In regard to claim 8, Nanjo et al. discloses the index is a hierarchical index in 
which a semantic unit (indexing term) points to another node of data. The index is an 
inverted list (302) with nodes (307 and 308) that point to a leaf structure (309-312). The 
leaf structures point to a centralized list of documents (column 9, lines 32-43). 

11. In regard to claim 11, Nanjo et al. discloses searching the textual database for 
target textual data (query entered into search string box Fig. 2, 203) using the semantic 
unit index (content index) (column 15, lines 31-38 and column 16, lines 41-43). 

12. In regard to claim 12, Nanjo et al. discloses converting a target word into a string 
of semantic units to perform the searching step. In Fig. 8, a flow diagram (800) is given 
in which a target word (string of characters to be searched) is entered (803) (column 15, 
lines 44-45). The target word (string of characters) is converted into semantic units 
(search terms) by checking whether the current character is a separator (811), or a 
character type transition (819). If either of these conditions is met, a key offset (KO) 
and key limiter (KL) type are entered into a key buffer as a delimiting semantic unit 
(search term) (column 16, lines 1-19 and column 12, lines 43-48). Once the target word 
(string of characters to be searched) has been converted to semantic units (search 
terms), a search is conducted on those semantic units (900). 
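
As a hedged illustration of the conversion described in paragraph 12, the sketch below splits a query string at separator characters and at character type transitions, and records an offset and a length for each resulting search term. It is not the flow diagram of Fig. 8: the names char_type, split_query, and search_terms, the four character classes, and the use of whitespace as the only separator are assumptions of this example.

```python
def char_type(ch):
    """Classify a character; the four classes are an assumption of this sketch."""
    if ch.isspace():
        return "separator"
    if ch.isdigit():
        return "digit"
    if ch.isalpha():
        return "alpha"
    return "other"


def split_query(query):
    """Return (offset, length) pairs, one for each search term in the query."""
    keys = []
    start = None       # offset of the term currently being scanned
    prev_type = None   # type of the previous character
    for i, ch in enumerate(query):
        current = char_type(ch)
        if current == "separator":
            # A separator closes the term in progress, if there is one.
            if start is not None:
                keys.append((start, i - start))
                start = None
        elif start is None:
            # First character of a new term.
            start = i
        elif current != prev_type:
            # A character type transition closes one term and opens the next.
            keys.append((start, i - start))
            start = i
        prev_type = current
    if start is not None:
        keys.append((start, len(query) - start))
    return keys


def search_terms(query):
    """Materialize the terms delimited by the (offset, length) pairs."""
    return [query[offset:offset + length] for offset, length in split_query(query)]
```

For the query "abc 123def", the separator closes the first term and the digit-to-letter transition closes the second, so three terms are delimited.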

Claim Rejections - 35 USC § 103 



13. The following is a quotation of 35 U.S.C. 103(a), which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

14. Claims 7, 10, and 14 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Nanjo et al. 

15. In regard to claim 7, Nanjo et al. discloses all features of the instant claimed 
invention except transcribing using semantic-unit based stenography. Semantic-unit 
based stenography, as broadly recited in the claim, has been interpreted as an input 
device based on semantic units (e.g. letters). A keyboard is an input device based on 
semantic units and, as is well known in the art, can be used to transcribe textual data. 
Using a keyboard to transcribe data would provide an additional transcribing method 
that could be used if other transcribing methods, such as speech recognition or 
character recognition, were not accurate enough. It would have been obvious to one of 
ordinary skill in the art at the time of invention to modify Nanjo et al. so that semantic- 
unit based stenography was used for transcription in order to provide an additional, 
more accurate transcribing method. 

16. In regard to claim 10, Nanjo et al. discloses that language symbols can be 
represented by series of bits. Nanjo et al. further discloses that given a particular 
coding scheme, a table can be constructed that translates a given code into an 
appropriate character of the language (column 2, lines 12-15). 

17. Nanjo et al. does not specifically disclose using the table to convert the index of 
textual data into a universal index. Converting the index of the textual data into a 
universal index would allow the textual data to be searched in any language. 

18. It would have been obvious to one of ordinary skill in the art at the time of 
invention to convert the index of the textual data into a universal index so that the 
textual data could be searched in any language. 

19. In regard to claim 14, Nanjo et al. discloses that the target textual data (the list of 
objects that satisfy the search criteria) is displayed (Fig. 8, 837) (column 16, lines 32- 
35). 

20. Nanjo et al. does not disclose displaying one semantic unit forward and 
backward for a given length based on a user request. 

21. It would have been an obvious matter of design choice to modify Nanjo et al. so 
that in addition to the target textual data being displayed, one semantic unit forward and 
backward was also displayed, since the applicant has not disclosed that displaying one 
semantic unit backward and forward solves any stated problem and it appears the 
display function would work equally well with any number of semantic units displayed on 
either side of the target textual data to provide needed context. 

22. Claims 4-6, 16, and 18-21 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Nanjo et al. in view of Holt et al. (U.S. Patent 5,960,447). 

23. In regard to claims 4-6, Nanjo et al. discloses all the features of the instant 
claimed invention except associating the textual data with audio data wherein the step 
of indexing further comprises indexing the audio data with semantic units, time stamping 
the semantic units, and decoding the textual data with a recognition system utilizing a 
language model based on semantic units. 

24. Holt et al. discloses a tagging and editing system that links textual data (word 
processor file Fig. 2, 60) to an audio file (53). Each semantic unit (word) in the textual 
data (word processor file 60) is indexed in the audio file (column 4, lines 1-18). The 
semantic units (words) are time-stamped (a time code pointing to a particular starting 
point in the audio file) (column 4, lines 5-7). A recognition system (52) receives speech 
as an input from the microphone (50) and transcribes the speech to textual data (text 
words) (column 3, lines 16-20). A speech recognition system typically utilizes a 
language model based on semantic units (e.g. phonemes in a HMM word model). 
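
The time-stamped linkage described in paragraph 24 can be pictured with the following minimal sketch, in which each transcribed word carries the starting point of the matching audio. It is an assumption-laden illustration rather than the system of Holt et al.: the TaggedWord type, the audio_offsets function, and the sample time codes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedWord:
    """A transcribed word plus the time code where its audio starts."""
    text: str
    start_seconds: float


def audio_offsets(transcript, word):
    """Return the audio start times of every occurrence of `word`."""
    wanted = word.lower()
    return [w.start_seconds for w in transcript if w.text.lower() == wanted]


# Hypothetical transcript with made-up time codes.
transcript = [
    TaggedWord("index", 0.0),
    TaggedWord("the", 0.4),
    TaggedWord("Index", 0.9),
]
```

Given a word selected in the text, audio_offsets(transcript, "index") yields the points in the audio file from which playback could begin.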

25. Adding indexes that link textual data transcribed with a recognition system 
to corresponding time-stamped audio data, as taught by Holt et al., to a system of 
managing a textual database, as taught by Nanjo et al., would allow the playback of 
associated audio for each recognized semantic unit, thereby helping in correction and 
proofreading of a textual database, as taught by Holt et al. (column 4, lines 29-31). 

26. It would have been obvious to one of ordinary skill in the art at the time of 
invention to add time-stamped indexes to audio data corresponding to the textual data 
in order to help in the correction and proofreading of a textual database. 

27. In regard to claim 16, Nanjo et al. discloses a system for managing (indexing) a 
textual database (400) that includes a textual database (objects 406, 407, 408, and 409) 
and an index generator (index program 415) that generates an index based on semantic 
units, which indexes the textual database with the corresponding semantic units 
(content index 410) (column 11, lines 38-66 and column 12, lines 1-6). 

28. Nanjo et al. does not disclose a recognition system for transcribing textual data 
into corresponding semantic units. 

29. Holt et al. discloses a recognition system (52) for transcribing textual data into 
corresponding semantic units (words). Combining the system for managing a textual 
database as taught by Nanjo et al. with a recognition system as taught by Holt et al. 
would allow a user to transcribe textual data without having to use a keyboard. 

30. It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Nanjo et al. to include a recognition device in order to transcribe 
textual data without having to use a keyboard. 

31. In regard to claim 18, Nanjo et al. does not disclose a language model based 
on semantic units. 

32. Holt et al. discloses a speech recognition system (52) to transcribe textual data. 
It is well known in the art that a speech recognition system typically utilizes a language 
model based on semantic units (e.g. phonemes in a HMM word model), so using a 
language model based on semantic units to transcribe textual data would be obvious to 
one of ordinary skill in the art at the time of invention in order to increase the amount of 
correctly recognized data. 

33. In regard to claim 19, Nanjo et al. discloses that language symbols can be 
represented by series of bits. Nanjo et al. further discloses that given a particular 
coding scheme, a table can be constructed that translates a given code into an 
appropriate character of the language (column 2, lines 12-15). 

34. Nanjo et al. does not specifically disclose using the table to convert the index of 
textual data into a universal index. Converting the index of the textual data into a 
universal index would allow the textual data to be searched in any language. 

35. It would have been obvious to one of ordinary skill in the art at the time of 
invention to further modify Nanjo et al., as modified by Holt et al., to convert the index of 
the textual data into a universal index so that the textual data could be searched in any 
language. 

Application/Control Number: 09/663,812 
Art Unit: 2655 

36. In regard to claim 20, Nanjo et al. discloses a query processor (search program 
418) that transforms a query into corresponding semantic units (column 16, lines 1-19 
and column 12, lines 43-48). The search program (418) also acts as a search engine 
that searches the textual database based on the semantic units corresponding to the 
search query (column 16, lines 40-43). 

37. In regard to claim 21, Nanjo et al. discloses that word boundaries are 
automatically marked in the search query. If a separator character is found in the query, 
a key offset (KO) and key length (KL) value is stored that delimits the boundaries of the word 
(column 16, lines 1-5). 

38. Claims 9 and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Nanjo et al. in view of Makhoul et al. (U.S. Patent 5,933,525). Nanjo et al. discloses all 
the features of the instant claimed invention except identifying the type of textual data 
so that the step of transcribing is performed based on the type of textual data identified. 

39. Makhoul et al. discloses an optical character recognition system that is language 
independent. The system can be used to recognize many languages (column 6, lines 
34-43). This would suggest to one of ordinary skill in the art at the time of invention that 
the transcription step would depend on which type of language was recognized. 
Modifying Nanjo et al. to identify the type of textual data and transcribe based on that 
type of textual data would allow orthographic rules for each particular 
language to be used in the recognition process, thereby minimizing the recognition 
search as taught by Makhoul et al. (column 6, lines 8-9 and 21-25). 

40. It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Nanjo et al. to include a step of identifying the type of textual data so 
that the step of transcribing is performed based on the type of textual data identified in 
order to minimize the recognition search as taught by Makhoul et al. 

41. In regard to claim 17, Nanjo et al. discloses all the features of the instant claimed 
invention, except a recognition system for transcribing textual data into corresponding 
semantic units wherein the recognition system comprises an OCR and an AHR. 

42. Makhoul et al. discloses an OCR system that transcribes textual data into 
semantic units (words). Makhoul et al. also discloses that the techniques used for character 
recognition (Hidden Markov Models) have been applied to AHR systems (on-line 
handwriting recognition systems) (column 1, lines 57-63). Modifying Nanjo et al. to 
include a recognition system comprising an OCR and an AHR would allow for 
alternative ways to transcribe textual data without typing. 

43. It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify Nanjo et al. to include a recognition system comprising an OCR and 
an AHR in order to provide alternative ways for transcribing textual data without typing. 

Conclusion 

44. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. George (U.S. Patent 5,832,478) discloses a method of searching 
an online dictionary wherein a syllable count is used as an additional query parameter. 
Grajski et al. (U.S. Patent 5,577,135) discloses an automatic handwriting recognition 
system. Bradford (U.S. Patent 5,805,747) discloses an optical character recognition 
system. Kucera (U.S. Patent 4,674,066) discloses a method of breaking a query term 
into its phonetic equivalent to enhance textual database searches. Halstead, Jr. et al. 
(U.S. Patent 5,963,893) discloses a method of identifying words in a Japanese text 
string. Nguyen et al. (A Comparison of Morpheme and Word Based Document 
Retrieval for Asian Languages) discloses that morpheme based indexing provides 
better search results for Asian languages. Mitchell et al. (U.S. Patent 5,892,099) 
discloses a system that links text data to a corresponding audio component in audio 
data. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Brian L Albertalli whose telephone number is (703) 305- 
1817. The examiner can normally be reached on Monday - Friday, 8:30 AM - 5:00 PM. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Talivaldis Smits, can be reached on (703) 305-3011. The fax phone number 
for the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 